(54) Method and system for optimally selecting a web firewall in a TCP/IP network



(57) The present invention relates to dynamic autoproxy
configuration and more particularly to a method and
system for selecting a Proxy/Socks Server according to
response time and availability criteria. It rests on a
dynamic autoproxy mechanism using availability and
response time probes. These probes retrieve well known
HTML pages through each Proxy/Socks Server, measure the
associated response time, and detect Proxy/Socks failures
and degradation of response time.

It also uses a CGI (Common Gateway Interface) program
for dynamically creating autoproxy code (in a preferred
embodiment Javascript code) on an autoproxy URL (Uniform
Resource Locator) system for selecting said Proxy/Socks
Server.
Description

Technical field of the Invention 

[0001] The present invention relates to computer 
networks, and more particularly to a method and sys- 
tem in a TCP/IP network for optimally selecting a Web 
Firewall according to some response time and availabil- 
ity criteria. 

Background art 

INTERNET 

[0002] The Internet is a global network of computers
and computer networks (the "Net"). The Internet
connects computers that use a variety of different oper- 
ating systems or languages, including UNIX, DOS, Win- 
dows, Macintosh, and others. To facilitate and allow the 
communication among these various systems and lan- 
guages, the Internet uses a language referred to as 
TCP/IP ("Transmission Control Protocol/Internet Proto- 
col"). TCP/IP protocol supports three basic applications 
on the Internet: 

transmitting and receiving electronic mail, 
logging into remote computers (the "Telnet"), and 
transferring files and programs from one computer 
to another ("FTP" or "File Transfer Protocol"). 

WORLD WIDE WEB 

[0003] With the increasing size and complexity of 
the Internet, tools have been developed to help find 
information on the network, often called navigators or 
navigation systems. Navigation systems that have been 
developed include standards such as Archie, Gopher 
and WAIS. The World Wide Web ("WWW" or "the Web") 
is a more recent and superior navigation system. The Web is:

• an Internet-based navigation system,

• an information distribution and management system
for the Internet, and

• a dynamic format for communicating on the Web.

The Web seamlessly integrates, for the user, different formats
of information, including still images, text, audio and video.
A user on the Web using a graphical user interface 
("GUI", pronounced "gooey") may transparently com- 
municate with different host computers on the system, 
and different system applications (including FTP and 
Telnet), and different information formats for files and 
documents including, for example, text, sound and 
graphics. 

HYPERMEDIA 

[0004] The Web uses hypertext and hypermedia. 
Hypertext is a subset of hypermedia and refers to
computer-based "documents" in which readers move from
one place to another in a document, or to another docu- 
ment, in a non-linear manner. To do this, the Web uses 
a client-server architecture. The Web servers enable 
the user to access hypertext and hypermedia informa- 
tion through the Web and the user's computer. (The 
user's computer is referred to as a client computer of the 
Web Server computers.) The clients send requests to 
the Web Servers, which react, search and respond. The 
Web allows client application software to request and 
receive hypermedia documents (including formatted
text, audio, video and graphics) with hypertext link
capabilities to other hypermedia documents, from a Web file
server.

The Web, then, can be viewed as a collection of docu- 
ment files residing on Web host computers that are 
interconnected by hyperlinks using networking proto- 
cols, forming a virtual "web" that spans the Internet. 

UNIFORM RESOURCE LOCATORS 

[0005] A resource of the Internet is unambiguously 
identified by a Uniform Resource Locator (URL), which 
is a pointer to a particular resource at a particular loca- 
tion. A URL specifies the protocol used to access a 
server (e.g. HTTP, FTP, ...), the name of the server, and
the location of a file on that server. 
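
As a minimal illustration of the three parts of a URL named above, the following Python sketch splits an example URL with the standard `urllib.parse` module. The URL itself is hypothetical; only its host name is borrowed from the example used elsewhere in this text.

```python
from urllib.parse import urlsplit

# Hypothetical example URL, used for illustration only.
url = "http://www.entreprise.com:80/docs/index.html"
parts = urlsplit(url)

protocol = parts.scheme    # protocol used to access the server
server = parts.hostname    # name of the server
port = parts.port          # port number, when one is given explicitly
location = parts.path      # location of the file on that server
```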
[0006] Each Web page that appears on client monitors
of the Web may appear as a complex document
that integrates, for example, text, images, sounds and 
animation. Each such page may also contain hyperlinks 
to other Web documents so that a user at a client com- 
puter using a mouse may click on icons and may acti- 
vate hyperlink jumps to a new page (which is a graphical 
representation of another document file) on the same or 
a different Web server. 

[0007] A Web server is a software program on a 
Web host computer that answers requests from Web clients,
typically over the Internet. All Web servers use a language
or protocol to communicate with Web clients, which is
called Hyper Text Transfer Protocol ("HTTP"). All types
of data can be exchanged among Web servers and cli- 
ents using this protocol, including Hyper Text Markup 
Language ("HTML"), graphics, sound and video. HTML 
describes the layout, contents and hyperlinks of the 
documents and pages. When browsing, Web clients:

• convert user specified commands into HTTP GET
requests,

• connect to the appropriate Web server to get
information, and

• wait for a response. The response from the server
can be the requested document or an error message.
[0008] After the document or an error message is
returned, the connection between the Web client and 
the Web server is closed. 

[0009] The first version of HTTP is a stateless protocol.
That is, with HTTP, there is no continuous connection
between each client and each server. The Web client
using HTTP receives a response as HTML data or other
data. This description applies to version 1.0 of the HTTP
protocol, while the newer version 1.1 removes this
limitation by keeping the connection between the server
and the client alive under certain conditions.

BROWSER 

[0010] After receipt, the Web client formats and
presents the data or activates an ancillary application
such as a sound player to present the data. To do this, the
server or the client determines the various types of data
received. The Web client is also referred to as the Web
Browser, since it in fact browses documents retrieved
from the Web Server.

DOMAIN NAMES 

[0011] The host or computer names (like
www.entreprise.com) are translated into numeric Internet
addresses (like 194.56.78.3), and vice versa, by
using a method called DNS ("Domain Name Service").
DNS is supported by network-resident servers, also
known as domain name servers or DNS servers.

INTRANET 

[0012] Some companies use the same mechanism
as the Web to communicate inside their own corporation.
In this case, this mechanism is called an "Intranet".
These companies use the same networking/transport
protocols and locally based Web servers to provide
access to vast amounts of corporate information in a
cohesive fashion. This data may be private to the
corporation, while the members of the company still
need access to public Web information. To prevent
people outside the company from accessing the private
Intranet from the public Internet, these companies
protect access to their network with special equipment
called a Firewall.

FIREWALL 

[0013] A Firewall protects one or more computers
with Internet connections from access by external
computers connected to the Internet. A Firewall is a network
configuration, usually created by hardware and software,
that forms a boundary between the networked computers
within the Firewall and those outside the
Firewall. The computers within the Firewall form a
secure sub-network with internal access capabilities
and shared resources not available from the outside
computers.

[0014] Often, a single machine, on which the Firewall
runs, allows access to both internal and external
computers. Since the computer on which the Firewall runs
directly interacts with the Internet, strict security
measures against unwanted access from external computers
are required.

[0015] A Firewall is commonly used to protect
information such as electronic mail and data files within a
physical building or organization site. A Firewall reduces
the risk of intrusion by unauthorized people from the
Internet. However, the same security measures can limit
access, or require special software, for those inside the
Firewall who wish to access information on the outside. A
Firewall can be configured using "Proxies" or "Socks" to
designate access to information from each side of the
Firewall.

PROXY SERVER 

[0016] An HTTP Proxy is a special server that typically
runs in conjunction with Firewall software and
allows access to the Internet from within a Firewall.
The Proxy Server:

• waits for a request (for example an HTTP request)
from inside the Firewall,

• forwards the request to the remote server outside
the Firewall,

• reads the response, and

• sends the response back to the client.

[0017] A single computer can run multiple servers, 
each server connection identified with a port number. A 
Proxy Server, like an HTTP Server or a FTP Server, 
occupies a port. Typically, a connection uses standard- 
ized port numbers for each protocol (for example, HTTP 
= 80 and FTP = 21) . That is why an end user has to 
select a specific port number for each defined Proxy 
Server. Web Browsers usually let the end user set the 
host name and port number of the Proxy Servers in a 
customizable panel. Protocols such as HTTP, FTP, 
Gopher, WAIS, and Security can usually have desig- 
nated Proxies. Proxies are generally preferred over 
Socks for their ability to perform caching, high-level log- 
ging, and access control, because they provide a spe- 
cific connection for each network service protocol. 

SOCKS 

[0018] A Socks Server (also called Socks Gateway) is
also software that allows computers inside a Firewall
to gain access to the Internet. Socks is usually installed
on a server positioned either inside or on the Firewall.
Computers within the Firewall access the Socks Server
as clients to reach the Internet. Web Browsers usually
let the end user set the host name and port number of
the Socks hosts (servers) in a customizable panel. On
some Operating Systems, the host is specified in a separate
file (e.g. socks.conf file). As the Socks Server acts
as a layer underneath the protocols (HTTP, FTP, ...), it
cannot cache data (as a Proxy does), because it doesn't
decode the protocol to know what kind of data it is
transferring.

OPTIONS 

[0019] The Web Browser often lets the end user
select between the options "No Proxies",
"Manual Proxy Configuration", or "Automatic Proxy
Configuration" to designate the conduit between his
computer and the Internet.

Users with a direct connection to the Internet 
should use the default, which is "No Proxies". 
If the Intranet is protected by one or several Fire- 
walls, the end user may : 

• select one of these Firewalls as the elected 
Proxy, by entering its host name into the "Man- 
ual Proxy Configuration", or 

• automatically refer to the enterprise policy in
terms of Proxies attribution between locations,
by pointing to a common configuration file in a
remote server. This is done by choosing the
"Automatic Proxy Configuration" and by providing
the Web Browser with the unique address
of the common configuration file ("Uniform
Resource Locator" or "URL") located in the
remote server.

[0020] Today, most of the Web Browsers are config- 
ured to forward all requests, even requests for internal 
hosts, through the Socks Firewall. So when the end 
user wants to have access to an internal Web-based 
application, his request travels to the Firewall, and is 
then reflected back into the internal network. This sends 
internal traffic on a long path, puts extra load on the 
Firewall and on the network, and worst of all, slows 
down the response time the end user sees from the 
applications and Web pages he is trying to access. This 
is called "non flexible" Socks access (when everything 
goes via the Socks Server). 

MANUAL PROXY CONFIGURATION 

[0021] The Manual Proxy configuration in the Web
Browser is simple to process, but its main drawback is
that the Firewall (or Proxy) selection is then static. There
is no dynamic criterion for the Firewall selection, such
as selection of the Firewall providing the best response
time. Firewall failures require a manual reconfiguration
of the navigation software to point to another active
Firewall, since the manual configuration usually only allows
the definition of one single Firewall per protocol with no
possibility to pre-configure a backup Firewall. In addition
to the manual proxy configuration in the Web Browser,
external procedures can be used to provide some kind
of robustness in the Firewall selection. They rely for
instance on the use of multiple Firewalls having the
same name defined as aliases in the Domain Name
Server (DNS). But this technique based on alias definition
still has drawbacks since, for instance, the DNS is
not always contacted for name resolution by Web Clients
that cache the name resolution locally. Other techniques
using external hardware equipment such as a load
and request dispatcher provide more robustness and
load balancing, but still have drawbacks such as the
need for additional and costly hardware.

AUTOMATIC PROXY CONFIGURATION

[0022] Automatic Proxy Configuration (also
referred to as "autoproxy") can set the location of the
HTTP, FTP, and Gopher Proxy every time the Web
Browser is started. An autoproxy retrieves a file of
address ranges and instructs the Web Browser to either
directly access internal IBM hosts or to go to the Socks
Server to access hosts on the Internet.
[0023] Automatic Proxy Configuration is more desirable
than simple Proxy Server Configuration in the Web
Browser, because much more sophisticated rules can
be implemented about the way Web pages are retrieved
(directly or indirectly). Automatic Proxy Configuration is
useful to users, because the Web Browser knows how
to retrieve pages directly if the Proxy Server fails. Also,
Proxy requests can be directed to another or multiple
Proxy Servers at the discretion of the system administrator,
without the end user having to make any additional
changes to his Web Browser configuration. In general,
these Proxy configuration files (also called autoproxy
code) are written in Javascript language. The autoproxy
facility can also contain a file of address ranges
for instructing the Web Browser to either directly access
internal hosts or to go to the Socks Server to access
hosts on the Internet. The Socks Server protects the
internal network from unwanted public access while
permitting access of network members to the Internet.
One of the drawbacks of this "autoproxy" mechanism is
that there is neither proactive Firewall failure detection
nor response time consideration.
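
The address-range rule described above (direct access for internal hosts, Socks Server for hosts on the Internet) can be sketched as follows. This is an illustrative Python model of the decision only, not actual autoproxy code, which is typically Javascript. The internal range `10.0.0.0/8` and the server name `socks.example.com:1080` are assumptions made for the example.

```python
import ipaddress

# Hypothetical values chosen for illustration only.
INTERNAL_RANGES = [ipaddress.ip_network("10.0.0.0/8")]
SOCKS_SERVER = "socks.example.com:1080"


def choose_route(host_ip: str) -> str:
    """Return "DIRECT" for internal addresses, else the Socks Server."""
    address = ipaddress.ip_address(host_ip)
    if any(address in network for network in INTERNAL_RANGES):
        return "DIRECT"
    return f"SOCKS {SOCKS_SERVER}"
```

Note that this static rule has exactly the drawback stated above: it takes no account of whether the Socks Server is currently available or how fast it responds.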

[0024] More explanations about the domain presented
in the above sections can be found in the following
publications incorporated herewith by reference:

• "Java Network Programming" by Elliotte Rusty
Harold, published by O'Reilly, February 1997.

• "Internet in a nutshell" by Valerie Quercia,
published by O'Reilly, October 1997.

• "Building Internet Firewalls" by Brent Chapman and
Elizabeth Zwicky, published by O'Reilly, September 1995.
PROBLEM

[0025] The problem to solve is to provide an opti- 
mized Web access, with a dynamic Proxy or Socks 
Server selection to get the best response time, and a 
detection of failures in Proxy or Socks Server to prevent 
Web service disruption. The current solutions address 
this problem partially: 

• Web Browsers can be manually configured with the 
target Proxy or Socks Server. The main drawbacks 
of this solution are the following : 

• There is no dynamic Proxy/Socks Server 
selection. A manual reconfiguration of the Web 
Browser upon Proxy/Socks Server failure is 
required. 

• Only a "manual" load balancing through the 
Web Browser static configuration is provided. 

• Proxy/Socks Server names must be known 
and manually configured by end users. 

• Web Browsers can be configured with their autoproxy
feature, using a static list of target
Proxy/Socks Servers downloaded from a dedicated
autoproxy URL (Uniform Resource Locator) system.
The main drawback of this solution is the following:

• There is no response time consideration in
the Proxy/Socks Server selection, nor efficient
Proxy/Socks Server failure detection (i.e. the Web
Browser waits for a time-out before switching to a
backup, even at initial autoproxy loading).

[0026] An alternative to these current solutions is to
cluster the Proxy/Socks Servers using an external
dispatcher system acting as a single logical access point. All
Web Browsers are then manually configured with the
name of that external dispatcher system (as the target
Proxy/Socks Server) which then routes the traffic to a
selected Proxy/Socks Server. An example of such a
dispatcher is the IBM Interactive Network Dispatcher
product. More information concerning this
product can be found in IBM's publication entitled
"Interactive Network Dispatcher V1.2 - User's Guide"
GC31-8496-01 incorporated herewith by reference. Although a
dispatcher oriented solution allows an efficient load
balancing in most cases, its main drawback is that an
additional dedicated system or specific hardware is
required, and that the external dispatcher name has to
be manually configured by the end users in their Web
Browsers.



Objects of the Invention 
[0027] 

• The object of the present invention is to optimize
Proxy/Socks Server selection by using availability
and response time criteria.

• It is a further object of the present invention to optimize
the Web service performance by integrating a
response time factor into the Proxy/Socks Server
selection.

• It is another object of the present invention to minimize
Web service interruption and thus to ensure
better service availability by automatically detecting
Proxy/Socks Server failures.

Summary of the invention 

[0028] The present invention relates to dynamic
autoproxy configuration and more particularly to a
method and system for optimizing selection of a
Proxy/Socks Server according to response time
and availability criteria. The invention rests on a
dynamic autoproxy mechanism using availability and
response time probes.

[0029] The present invention also relies on probes
retrieving well known HTML pages through each
Proxy/Socks Server, measuring the associated response
time, and detecting Proxy/Socks failures and degradation of
response time.

[0030] The present invention also uses a CGI
(Common Gateway Interface) program for dynamically
creating autoproxy code (in a preferred embodiment
Javascript code) on an autoproxy URL (Uniform
Resource Locator) system for selecting the
Proxy/Socks Server using availability and response
time information provided by probes.

[0031] The present invention fixes the drawbacks of
the existing solutions by integrating dynamic
Proxy/Socks availability and response time selection
criteria into the autoproxy mechanism.
[0032] The present invention provides the following
advantages:

• Early detection of Proxy/Socks Server failures
provides a high Web service availability.

• Integration of a response time factor into the
Proxy/Socks Server selection optimizes the Web
service performance.

• Induced HTTP survey traffic is minimized by running
availability and response time probes from a
single autoproxy URL system (compared with running
the probes on each Web Browser system).

• Integration of response time degradation in the
probes achieves a proactive Proxy/Socks Server
failure detection.

• A periodic dynamic update of the "best" Proxy/Socks
Server can be provided to the Web Browser.

• Useless traffic to a failing Proxy/Socks Server is
minimized since Proxy/Socks Servers are excluded
from the list of available target servers upon failure
detection.

• No additional or specific hardware is required.

• Ease of Web Browser configuration is provided to
mobile users (the Web Browser is configured once).

• Web Browser performance is not degraded
because availability and response time probes are
not processed within the downloaded autoproxy
code (Javascript code) but in the autoproxy URL
system.

Drawings 

[0033] The novel and inventive features believed
characteristic of the invention are set forth in the
appended claims. The invention itself, however, as well 
as a preferred mode of use, further objects and advan- 
tages thereof, will best be understood by reference to 
the following detailed description of an illustrative 
detailed embodiment when read in conjunction with the 
accompanying drawings, wherein : 

Figure 1 is a general logical view of an end user 
system interfacing a Web Browser for accessing the 
World Wide Web according to prior art. 

Figure 2 is a general physical view of the set-up 
shown in Figure 1, according to prior art.

Figure 3 is a logical view of the availability and 
response time probes external flows according to 
the present invention. 

Figure 4 is a flow chart showing the internal logic 
flow of the availability and response time probe 
introduced in Figure 3 according to the present
invention.

Figure 5 is a physical view of the logical environ- 
ment described in Figure 3 according to the present 
invention. 

Figure 6 is a view of the data flows associated with 
the entities depicted in Figure 5, according to the 
present invention. 

Figure 7 depicts the storage of the availability and
response time probe measurements, according
to the present invention.

Figure 8 is a flow chart of the program running on 
the autoproxy URL (Uniform Resource Locator)
system, according to the present invention. 



Preferred embodiment of the invention 

[0034] The present invention relates to dynamic
autoproxy configuration and more particularly to a
method and system for selecting a Proxy/Socks Server
according to response time and availability criteria.
It rests on a dynamic autoproxy mechanism using
availability and response time probes. These
probes retrieve well known HTML pages through each
Proxy/Socks Server, measure the associated response
time, and detect Proxy/Socks failures and degradation of
response time.

It also uses a CGI (Common Gateway Interface) program
for dynamically creating autoproxy code (in a preferred
embodiment Javascript code) on an autoproxy
URL (Uniform Resource Locator) system for selecting
said Proxy/Socks Server.

LOGICAL VIEW OF AN END USER ACCESSING THE
WORLD WIDE WEB

[0035] Figure 1 shows a user system with a user
interface (102) running a program known as a Web
Browser (101) which enables access to the
World-Wide-Web (WWW). The WWW content is transferred
using the HTTP protocol. HTTP requests and
responses go to and from the Web Browser program
(101) and a destination Web Server (103) containing
the WWW content the user tries to access. The
Firewall (104) between the Web Browser (101) and the
Web Server (103) acts as an intermediary HTTP Proxy
forwarding the HTTP requests and responses to their
destination. The Web Browser program (101) makes an
HTTP request to the Firewall (104) and the Firewall
forwards the request to the destination Web Server (103).
The flow in the reverse direction is the HTTP response
which again goes via the Firewall (104) on its way to the
Web Browser (101). In this way the Firewall can limit
the traffic to the transactions it is configured to allow
(based on some defined security and access control
policy). The Firewall hence protects the network where
the Web Browser is located.

GENERAL PHYSICAL VIEW OF AN END USER
ACCESSING THE WWW

[0036] Figure 2 is a physical view of the set-up
shown logically in Figure 1. In this particular example,
the Web Browser (201) runs on a system attached to an
Intranet (202). The Firewalls (203) that protect the
Intranet attach both the (private) Intranet (202) and the
(public) Internet (204). The destination Web Server
(205) also connects to the Internet. This is the environment
where the Web Browser, Firewalls, and Web
Server perform their function when the user is "browsing"
the Internet WWW. It is important to note that
the Firewalls attach two networks and hence are able to
act as the intermediary for communications between the
two networks. Multiple Firewalls are often used in order
to provide some degree of access robustness and load
sharing.

LOGICAL VIEW OF AVAILABILITY AND RESPONSE 
TIME PROBES 

[0037] The domain of the invention is the one
described in Figure 1 and Figure 2, where a user within
an Intranet wants to access the World-Wide-Web using
a Web Browser, and where the Intranet network is
protected from the Internet by several Firewalls playing the
role of so-called HTTP Proxies (Figure 2). The issue is to
select the "best" Proxy/Socks Server to ensure an
optimized availability and response time of the service to
the end user. To automatically optimize this selection, a
software component called a "WWW availability and
response time probe" is introduced. Its role is to provide
selection criteria. As shown in Figure 3, this data is
gathered by measuring the response time for requesting
a specific content of a well known Web Server. The
induced HTTP survey traffic is minimized by running
the availability and response time probes from a single
autoproxy URL system (versus running the probes on
each Web Browser client system).
[0038] Figure 3 demonstrates the function of a flexible
WWW availability and response time probe and the
way it can be used to gather measurements on the
availability and response time of both HTTP Proxies and 
Socks Servers. The upper part of Figure 3 details the 
interaction of the probe with an HTTP Proxy Server 
(304) . The client system (302) that runs the probe (con- 
figured to test proxies) basically requests Web content 
(page) from the Web Server (307) via the Proxy Server 
(304) similar to the process shown in Figure 1. The 
HTTP request in this case represents an "HTTP survey 
flow" (303) to the Proxy Server. The Proxy Server for- 
wards (306) the request to the Web Server (via the Fire- 
wall (305) which is not depicted) . The client system 
times how long the request/response HTTP survey flow 
takes and uses this information as a measurement of 
the response time and availability via the tested Proxy 
Server (for the Web content samples that were tested) . 
If the client system is also the autoproxy URL system 
(301) then this measurement information for each Proxy 
Server can be used to work out a sense of the "best" 
Proxy Server to use. This can then be encoded in the 
autoproxy URL that the Web Browser programs use to 
work out their correct Proxy Server to use. 
[0039] The lower portion of Figure 3 shows a
similar arrangement but in this case the measurement
data is being gathered for a Socks Server (Gateway)
access method. Again a client probe (309) makes an
HTTP request that represents an "HTTP survey flow"
(310) which travels via the Socks Server (311) and then
onto (312) the destination Web site (313). This HTTP
request is for a set target URL (308) that is known to
exist on the target Web Server. Again it is the timing of
how long this survey takes that provides the measurement
data that can be used to generate an autoproxy
URL that takes into account the relative performance of
a set of Socks Servers (or in the case above, HTTP
Proxies).

[0040] Obviously if there is no response to the
HTTP survey flow, then the particular Proxy or Socks
Server being tested can be marked as unavailable. In
this way the autoproxy URL can avoid selecting
Proxy or Socks Servers that do not work.
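
The selection principle of the preceding paragraphs can be sketched as follows: servers with no response are excluded, and the fastest remaining server is taken as the "best". This is a minimal Python sketch under stated assumptions. The function name `select_best`, the use of `None` to mark an unavailable server, and the server names in the usage note are hypothetical, and the text does not prescribe picking the strict minimum.

```python
from typing import Dict, Optional


def select_best(measurements: Dict[str, Optional[float]]) -> Optional[str]:
    """Return the available server with the lowest response time.

    `measurements` maps a server name to its measured response time in
    seconds, or to None when the server did not answer the survey flow.
    """
    available = {s: t for s, t in measurements.items() if t is not None}
    if not available:
        return None  # every Proxy/Socks Server is marked unavailable
    return min(available, key=available.get)
```

For example, with measurements `{"proxy1:8080": 0.8, "proxy2:8080": None, "socks1:1080": 0.3}`, the server `proxy2:8080` is skipped as unavailable and `socks1:1080` is selected.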

INTERNAL LOGIC OF THE AVAILABILITY AND 
RESPONSE TIME PROBES 

[0041] The internal mechanism of the probe itself is
described in Figure 4. The probe simulates a Web Client,
by requesting through an HTTP connection a Web
page from a target URL through the target Proxy/Socks
Server (using its host name and port as a reference).
The Web page is retrieved either through a normal
HTTP connection, or through a socksified flow (a flow
through a Socks Server). Typically, normal flow is used
to retrieve a Web page from a Proxy Server or from a
Web Server, while socksified flow is used to retrieve a
Web page through a Socks Server. Then, the probe
basically checks that the Web page:

• is received within an allowed amount of time in
seconds, and

• contains a specific keyword to make sure that the
received page is correct.

When these two conditions are fulfilled, the Web page
retrieval is successful.

[0042] Finally, the probe returns either the associated
response time in seconds (successful retrieval) or
a failure return code. This mechanism retrieves one or
multiple target Web pages. When multiple Web pages
are retrieved, the probe program sequentially tests each
Web page until one Web page retrieval is successful or
all Web page retrievals fail. Probes:

• retrieve well known HTML pages through each
Proxy/Socks Server,

• measure associated response time, and also

• detect Proxy/Socks Server failures and response
time degradation.
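
The text does not specify how response time degradation is computed. As one possible sketch, assuming a short sliding window of past measurements and a simple threshold, a probe could flag a sample as degraded when it exceeds a multiple of the recent average. The class name, the window of 5 samples, and the factor of 2.0 are all hypothetical choices made for this Python example.

```python
from collections import deque
from statistics import mean


class ResponseTimeHistory:
    """Keep a short history of response times and flag degradation."""

    def __init__(self, window: int = 5, factor: float = 2.0) -> None:
        self.samples = deque(maxlen=window)  # oldest samples drop out
        self.factor = factor

    def add(self, response_time: float) -> bool:
        """Record a sample; return True if it looks degraded."""
        degraded = (
            len(self.samples) > 0
            and response_time > self.factor * mean(self.samples)
        )
        self.samples.append(response_time)
        return degraded
```

A threshold relative to the recent average is only one option; an absolute limit in seconds or a trend over several samples would fit the description equally well.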

[0043] Figure 4 is a flow chart showing the internal
logic flow of the WWW availability and response time
probe introduced in Figure 3.

The probe program first starts a timer (401).

Next, the probe program attempts to establish a
connection (402) with the target Web Server to
retrieve a Web page at the target URL (Universal
Resource Locator). The probe program establishes
the connection according to the way it has been
configured, e.g. via an HTTP Proxy Server, via a
Socks Server (Gateway) or directly.

If the attempt to establish the connection is
unsuccessful, the probe program immediately goes
into error mode (408). An error value is returned
(407) by the probe program indicating that the
connection is not possible.

If the attempt to establish the connection is
successful, the Web page is retrieved (403) by the
probe program.

The probe program then closes the connection
(404) pursuant to the normal HTTP protocol
procedure.

To ensure that the Web page has been correctly
retrieved, the probe program then searches for
known keywords (405) that are expected to be in
the Web page.

If the keyword is found (406) in the Web page, then
the Web page retrieval is successful. The timer is
stopped and the correct response time for the
operation is returned. By storing and integrating a short
history of the measured response times, the probe
program can detect and report any response time
degradation, which makes it possible to anticipate
Proxy/Socks Server failures.

If however the correct keyword is not found (407) in
the Web page, then the Web page retrieval is
unsuccessful and again an error value is returned. This
sort of error typically occurs when the connection is
successfully established but a Web page containing
an error is retrieved.

The probe goes into retry mode (409) only when it
is configured to try multiple destination URLs as
opposed to a single URL. This adds some robustness
to the testing performed by the probe and hence
insulates it somewhat from one-off network "glitches"
(e.g. dropped connections etc.).
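
The logic flow of Figure 4 can be illustrated by the following minimal Python sketch. It is not the probe program of the invention: the function names (probe_page, probe, http_proxy_fetch), the rule that every keyword must be present, and the use of None as the failure return code are assumptions made for illustration. The numbers in the comments refer to the steps of Figure 4. Only the HTTP Proxy Server case is shown; access via a Socks Server would need a different fetch function.

```python
import http.client
import time
import urllib.request
from typing import Callable, Optional, Sequence


def http_proxy_fetch(proxy_url: str, timeout: float = 10.0) -> Callable[[str], str]:
    """Build a fetch function that retrieves pages via one HTTP Proxy Server."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url})
    )

    def fetch(url: str) -> str:
        # The connection is opened (402), the page read (403) and the
        # connection closed (404) when the "with" block exits.
        with opener.open(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")

    return fetch


def probe_page(fetch: Callable[[str], str], url: str,
               keywords: Sequence[str]) -> Optional[float]:
    """Return the response time in seconds, or None as the failure code."""
    start = time.monotonic()                      # start the timer (401)
    try:
        page = fetch(url)                         # steps 402 to 404
    except (OSError, http.client.HTTPException):
        return None                               # error mode (408)
    if not all(word in page for word in keywords):  # keyword search (405)
        return None                               # keyword not found (407)
    return time.monotonic() - start               # keyword found (406)


def probe(fetch: Callable[[str], str], urls: Sequence[str],
          keywords: Sequence[str]) -> Optional[float]:
    """Try each target URL in turn (retry mode, 409) until one succeeds."""
    for url in urls:
        elapsed = probe_page(fetch, url, keywords)
        if elapsed is not None:
            return elapsed
    return None                                   # all retrievals failed
```

A probe for one Proxy Server could then be run as probe(http_proxy_fetch("http://proxy.example.com:8080"), urls, keywords), where the proxy address, the URL list and the keyword list are placeholders chosen by the operator.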

PHYSICAL VIEW OF AVAILABILITY AND 
RESPONSE TIME PROBES EXTERNAL FLOWS 

[0044] The probes are used by various components 
and in various flows (Figures 5 and 6) in order to provide 
the Web Browser with the best Proxy/Socks Server. The 
data gathered by the probes are indirectly downloaded 
to the Web Browser by using an autoproxy mechanism. 
The present invention allows a software implementation 
with no additional or specific hardware. 
[0045] The output from the probe is stored on the
autoproxy URL system as shown in Figure 7 and used
to create the autoproxy code (Javascript code in a
preferred embodiment). No extra processing takes place
inside the code: Web Browser performance is not
degraded because the availability and response time
probes are not processed within the downloaded
autoproxy code (Javascript code) but in the autoproxy
URL system.
[0046] A CGI (Common Gateway Interface) program
dynamically creates the autoproxy code as shown
in Figure 8 with the availability and response time
information provided by probes. Selecting a Proxy/Socks
Server according to the response time and availability
criteria measured by the probes is fully compatible,
and can be combined, with existing criteria such as the
client's origin IP subnet.
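
One possible way of combining the probe measurements with the client's origin IP subnet is sketched below in Python. The Candidate record, the preferred_subnet field and the rounding of response times to a tenth of a second are hypothetical choices made for this illustration, not features stated in the description. Servers whose probe reported a failure are left out, the remaining servers are ordered by response time, and the subnet is used only to break ties.

```python
import ipaddress
from typing import List, NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    name: str                        # e.g. "proxy-eu.example.com:8080"
    response_time: Optional[float]   # seconds; None means the probe failed
    preferred_subnet: str            # hypothetical: clients "close" to this server


def rank_servers(candidates: Sequence[Candidate], client_ip: str) -> List[str]:
    """Return the names of the usable servers, most suitable first."""
    client = ipaddress.ip_address(client_ip)
    usable = [c for c in candidates if c.response_time is not None]

    def sort_key(c: Candidate):
        # Assumption: response times within the same 0.1 s bucket count as equal.
        is_remote = client not in ipaddress.ip_network(c.preferred_subnet)
        return (round(c.response_time, 1), is_remote)

    return [c.name for c in sorted(usable, key=sort_key)]
```

With this ordering, a clearly faster server always wins, and the client's subnet only decides between servers whose measured response times are comparable.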

[0047] The use of response time and availability
criteria also provides proactive detection of Proxy/Socks
Server failures by taking response time degradation
into account. The Web Browser can be periodically and
dynamically updated with a new selection of the "best"
Proxy/Socks Server using:

a "refresh" tag in the autoproxy code,

external code (or a Java applet), or

a new feature in the Web Browser for periodically
and automatically refreshing the autoproxy code.
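
The proactive failure detection mentioned in [0047] relies on a short history of response times. A minimal Python sketch of such a history is given below. The class name ResponseTimeHistory, the window of five samples and the degradation rule (a new measurement more than twice the average of the stored ones) are assumptions chosen for illustration; the description does not specify how the history is integrated.

```python
from collections import deque
from statistics import mean


class ResponseTimeHistory:
    """Keeps the most recent response times measured for one Proxy/Socks Server."""

    def __init__(self, window: int = 5, factor: float = 2.0):
        self.samples = deque(maxlen=window)   # oldest sample is dropped automatically
        self.factor = factor

    def add(self, response_time: float) -> bool:
        """Record a measurement and return True if it indicates degradation."""
        history_full = len(self.samples) == self.samples.maxlen
        degraded = history_full and response_time > self.factor * mean(self.samples)
        self.samples.append(response_time)
        return degraded
```

A server flagged as degraded could, for example, be ranked lower before it actually fails; how the flag is used is left to the selection program.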

[0048] Another positive consequence is that useless
traffic to failing Proxy/Socks Servers is minimized,
since Proxy/Socks Servers are excluded from the list of
available target servers upon failure detection. Since an
autoproxy mechanism is used, there is no need to
manually update the proxy configuration in the
Web Browser when a Proxy/Socks Server fails.
Proxy/Socks Server names or locations do not need to
be known and configured by the end user, which provides,
for instance, a seamless service for mobile users.
[0049] Figure 5 is a physical view of the logical
environment described in Figure 3. The Web Browser (501)
attached to the Intranet (502) is configured to use an
autoproxy URL to determine which Proxy/Socks Server
(Firewall) (503) to use to access the Internet
(504) and the destination Web Server (505). The
system where the autoproxy URL resides (506) also runs
the availability and response time probes (507)
configured to test the Proxy/Socks Servers. The autoproxy
URL uses the CGI (Common Gateway Interface) (508)
to dynamically generate the autoproxy code of the
autoproxy URL. The autoproxy code is based on the
information gathered by the availability and response time
probes. In this way the Web Browser is configured with

an available Proxy/Socks Server, and

what is deemed the best Proxy/Socks Server.

DATA FLOWS OF AVAILABILITY AND RESPONSE
TIME PROBES

[0050] Figure 6 is a view of the actual data flows
associated with the entities depicted in Figure 5. Again,
the Web Browser (601) attached to the Intranet (602) is
configured to use an autoproxy URL to determine which
Proxy/Socks Server (Firewall) (603) to use to access
the Internet (604) and the destination Web
Server (605). The Web Browser first accesses (609)
the autoproxy URL system (606) to determine
which Proxy/Socks Server it should use. The Web
Browser can be periodically and dynamically updated
with the "best" Proxy/Socks Server using:

the "refresh" tag in the autoproxy code,

an external code (or Java applet), or

a new Web Browser feature for periodically and
automatically refreshing the autoproxy code.

[0051] The system where the autoproxy URL
resides, and which runs the availability and response
time probes (607), uses the CGI (Common Gateway
Interface) (608) to dynamically generate the autoproxy
code (610) of the autoproxy URL. The autoproxy code is
based on the information gathered by the availability
and response time probes that have tested the
Proxy/Socks Servers via the HTTP survey flows (611
and 612) described in Figure 3. In this way the Web
Browser ends up with what is deemed the best
Proxy/Socks Server.

INTERNAL STORAGE OF AVAILABILITY AND
RESPONSE TIME PROBES

[0052] Figure 7 depicts the internal storage within
the autoproxy URL system of the information retrieved
by the availability and response time probes (701).
Each probe updates (702) a table in the autoproxy URL
system (703) with the measurements of each
Proxy/Socks Server it tests. In this way the table
contains the current state of all Proxy/Socks Servers that
are candidates to be selected and used by the Web
Browser. At configurable or periodic time intervals,
the probes test the Proxy/Socks Servers again (704) and
the cycle is repeated.
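
The storage cycle of Figure 7 can be pictured with the short Python sketch below. The table is represented as a dictionary keyed by server name; the ServerState record and the function names update_table and run_cycle are hypothetical, and the probe is passed in as a function that returns a response time in seconds or None on failure.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass
class ServerState:
    available: bool
    response_time: Optional[float]   # seconds; None when the probe failed
    updated_at: float                # time of the measurement


def update_table(table: Dict[str, ServerState], server: str,
                 result: Optional[float]) -> None:
    """Store the latest probe result for one Proxy/Socks Server (702)."""
    table[server] = ServerState(result is not None, result, time.time())


def run_cycle(table: Dict[str, ServerState], servers: Iterable[str],
              probe: Callable[[str], Optional[float]]) -> None:
    """Test every candidate server once and refresh its table entry (704)."""
    for server in servers:
        update_table(table, server, probe(server))
```

Calling run_cycle repeatedly, with a pause of the configured interval between calls, keeps the table at the current state of all candidate servers.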

PROGRAM RUNNING AT AUTOPROXY URL SYSTEM

[0053] Figure 8 refers to the internal logic of
the program running on the autoproxy URL system.

The autoproxy URL system is initially contacted by
a Web Browser (801) wanting to "know" which is
the best Proxy/Socks Server (or Firewall) to use.
This is for instance achieved by selecting the
Automatic Proxy Configuration option in the Web
Browser and by providing information such as the
URL of the autoproxy code.

The autoproxy URL system activates (802) the CGI
(Common Gateway Interface) program (via the
Web Server CGI extensions). The CGI program
has access to all standard CGI variables including
the IP (Internet Protocol) address of the Web
Browser.

The CGI program selects (803) the best
Proxy/Socks Server for the client system (Web
Browser) based on both the IP address of the Web
Browser (obtained as a CGI variable) and the
information generated by the availability and response
time probes for each Proxy/Socks Server and
stored in the table of the autoproxy URL system
(807). The IP address is used to add a geographical
criterion to the Proxy/Socks Server selection. For
instance, if two Proxy/Socks Servers provide the
same response time (one in the US, the other one
in Europe), the closest Proxy/Socks Server is
preferred (the one in Europe if the Web Browser is in
Europe).

To improve the robustness of the Proxy/Socks
Server selection, the CGI program selects a best
"backup" Proxy/Socks Server for the Web Browser
(804). This "backup" Proxy/Socks Server is
automatically used by the Web Browser after it times out
attempting to use what it thinks is the "best"
Proxy/Socks Server. Again this "backup"
Proxy/Socks Server is selected using both the IP
address of the Web Browser (obtained as a CGI
variable) and the information generated by the
availability and response time probes for each
Proxy/Socks Server and stored in the table of the
autoproxy URL system (807).

Once the CGI program has selected the best and
backup Proxy/Socks Servers, it creates the
autoproxy code (805). This code is generally written
in Javascript.

Once the autoproxy code has been created, the
autoproxy URL system downloads it to the Web
Browser (806) via standard HTTP protocol as any
other output of a CGI program.

While the invention has been particularly shown and
described with reference to a preferred embodiment, it
will be understood that various changes in form and
detail may be made therein without departing from the
spirit and scope of the invention.
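
As an illustration of the autoproxy code creation and download steps of Figure 8 (805 and 806), the Python sketch below builds the Javascript text that a CGI program could return once the best and backup servers are known. The function names build_autoproxy_code and cgi_response are hypothetical. The FindProxyForURL function, its semicolon-separated return string and the application/x-ns-proxy-autoconfig content type follow the commonly used proxy auto-configuration format, which is an assumption here since the description does not give the generated code.

```python
from typing import Sequence, Tuple


def build_autoproxy_code(servers: Sequence[Tuple[str, str]]) -> str:
    """Create the autoproxy code (805).

    servers lists ("PROXY" or "SOCKS", "host:port") pairs, best server first,
    backup server second. The browser tries the entries in order.
    """
    chain = "; ".join(f"{kind} {address}" for kind, address in servers)
    return (
        "function FindProxyForURL(url, host) {\n"
        f'    return "{chain}";\n'
        "}\n"
    )


def cgi_response(servers: Sequence[Tuple[str, str]]) -> str:
    """Return the full CGI output: header, blank line, then the code (806)."""
    header = "Content-Type: application/x-ns-proxy-autoconfig\r\n\r\n"
    return header + build_autoproxy_code(servers)
```

In an actual CGI program, the address of the Web Browser would typically be read from the REMOTE_ADDR variable before the servers are chosen, and the string returned by cgi_response would be written to standard output.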



Claims 

1. A method for dynamically selecting a firewall server
(603) for a web client (601), in particular a web
browser (601), in a Transmission Control
Protocol/Internet Protocol (TCP/IP) network comprising
a plurality of firewall servers (503), said method
comprising the steps of:

measuring performance and availability of
each firewall server (603) using measurement
probes (607);

dynamically selecting a firewall server according
to the performance and availability measurements
(607).

2. The method according to the preceding claim
wherein the step of measuring the performance
and availability of each firewall server (603) using
measurement probes (607) comprises the further
step of:

measuring the response time needed for
retrieving from a web server (605) known
information, in particular one or a plurality of known
web pages, through each firewall server (603).

3. The method according to the preceding claim
wherein the step of measuring the response time
comprises the further steps of:

establishing (402) a connection with the web
server (605) through each firewall server (603);

retrieving (403) the one or a plurality of known
web pages from the web server (605);

checking (405) that the retrieved one or plurality
of web pages contain one or a plurality of
known keywords.

4. The method according to any one of the preceding
claims wherein the step of measuring the
performance of each firewall server (603) using
measurement probes (607) comprises the further step of:

comparing for each firewall server said measured
response time with previous measured
response times;

determining for each firewall (603) the degradation
or the amelioration of the measured
response time.

5. The method according to any one of the preceding
claims wherein the step of measuring the availability
of each firewall server using measurement
probes (607) comprises the further step of:

detecting failures on each firewall server;

excluding firewall servers in failure from the
step of selecting a firewall server.

6. The method according to any one of the preceding
claims wherein said firewall server (603) is a proxy
server (304) or/and a socks server (311).

7. The method according to any one of the preceding
claims comprising the further steps of:

processing performance and availability
measurements (607) from a single universal
resource locator (URL) system (606);

dynamically creating a configuration file based
on the performance and availability measurements,
preferably in Javascript language, on
said universal resource locator (URL) system
(606) for selecting said firewall server (603).

8. The method according to any one of the preceding
claims wherein the step of dynamically creating a
configuration file is processed by a common
gateway interface (CGI) (608) on said universal
resource locator (URL) system (606).

9. The method according to any one of the preceding
claims wherein the step of selecting a firewall
server (603) comprises the further step of:

downloading the configuration file from the
universal resource locator (URL) system (606) to
the web client, in particular to the web browser
(601).

10. The method according to any one of the preceding
claims wherein the steps of measuring performance
and availability and of dynamically selecting a
firewall server (603) are periodically processed in
the universal resource locator (URL) system (606)
and the configuration file created by the common
gateway interface (608) (CGI) is periodically
downloaded to the web client (601).

11. The method according to any one of the preceding
claims comprising the further steps of:

pre-selecting a backup firewall server (603) in a
background process;

switching to said backup firewall server in case
of failure of the selected firewall server.

12. The method according to any one of the preceding
claims wherein the step of selecting a firewall server
according to performance and availability
measurements comprises the further step of:

selecting the firewall server according to the
Internet Protocol (IP) address.

13. A system comprising means adapted for carrying
out the method according to any one of the
preceding claims.
General physical view of an end user accessing the World-Wide-Web
Logical view of availability and response time probes external flows
Flow chart of internal logic of availability and response time probe
Physical view of availability and response time probes external flows
Data flows of availability and response time probe
Internal storage of WWW availability and response time probes
Flow chart of the program running at autoproxy URL system